Coronaviruses comprise a large group of RNA viruses with diverse
host specificity. The emergence of highly pathogenic strains like
the SARS coronavirus (SARS-CoV), and the discovery of two new
coronaviruses, NL-63 and HKU1, corroborate the high rate of
mutation and recombination that has enabled them to cross species
barriers and infect novel hosts. For that reason, the development
of broad-spectrum antivirals that are effective against several
members of this family is highly desirable. This goal can be
accomplished by designing inhibitors against a target, such as the
main protease 3CLpro (Mpro), which is highly conserved among all
coronaviruses. Here 3CLpro derived from SARS-CoV was used as the
primary target to identify a new class of inhibitors containing a
halomethyl ketone warhead. The compounds are highly potent against
SARS 3CLpro, with Ki values as low as 300 nM. The crystal structure
of the complex of one of the compounds with 3CLpro indicates that
this inhibitor forms a thioether linkage between the halomethyl
carbon of the warhead and the catalytic Cys 145. Furthermore,
structure-activity relationship (SAR) studies of these compounds
have led to the identification of a pharmacophore that accurately
defines the essential molecular features required for high
affinity.


Abbreviations: CoV, coronavirus; SARS, severe acute respiratory
syndrome; HCoV, human coronavirus; MOE, molecular operating
environment; RTI, respiratory tract infection; NL-63, human
coronavirus Netherlands 63; HKU1, human coronavirus HKU1; TCEP,
Tris[2-carboxyethyl] phosphine; EDTA, ethylenedinitrilotetraacetic
acid.
Coronaviruses (CoVs) are responsible for more than 30% of all
respiratory tract infections (RTIs), affecting both the upper and
lower respiratory tract (1). Coronaviruses were previously thought
to cause only benign respiratory infections, with infection rates
peaking during the winter months. The emergence of the most
virulent member of this family, SARS-CoV, was a harsh deviation
from this belief (2-4). The discovery of two new species of CoV
that infect humans, NL-63 and HCoV-HKU1, in 2004 and 2005,
respectively, confirms the high rate of mutagenesis and genetic
recombination within Coronaviridae (5,6). Evolutionary insights
gained from sequence comparisons of different strains of SARS-CoV
have shown that the virus has a mutation rate of
8.26 × 10⁻⁶/nt/day (about one third that of HIV) (7). This high
rate of mutation often leads to host swapping or to the generation
of novel CoVs, posing a significant challenge to the development of
broad-spectrum inhibitors (8).


An appropriate target for the development of broad-spectrum
anti-coronavirals should be both indispensable to the viral life
cycle and conserved among CoVs. The main viral protease 3CLpro
plays a key role in viral transcription and propagation of progeny
virions (3,9-12). Inhibition of this enzyme with a general cysteine
protease inhibitor has been shown to inactivate viral replication
in mouse hepatitis virus (MHV) and human coronavirus 229E
(HCoV-229E) infected cells (13,14). 3CLpro is a three-domain
cysteine protease, which predominantly occurs as a dimer in
solution. Previous studies have concluded that dimerization is
required for the activity of the protease (15-18). The catalytic
residues Cys 145 and His 41 are located in a cleft between the
first two domains (Figure 1) that forms a highly conserved active
site cavity. Although the overall sequence identity of 3CLpro among
members of the coronaviral family is only 40-50%, the
three-dimensional structures of the proteases are very similar
(Figure 1) (19). Sequence conservation is more pronounced in
certain regions of 3CLpro. One such region is a cluster of serine
residues consisting of Ser 139, Ser 144 and Ser 147, adjacent to
the active site of SARS 3CLpro (20). Subsequent alanine mutagenesis
experiments on these serine residues, performed in our laboratory,
indicated that this cluster plays a major role not only in the
activity of the protease but also in its ability to dimerize (16).


Figure 1: The aligned structures of SARS 3CLpro (PDB ID 1Q2W),
HCoV 229E 3CLpro (PDB ID 1P9S), IBV 3CLpro (PDB ID 2Q6D), and TGEV
3CLpro (PDB ID 1LVO) are shown in ribbon representation and
coloured blue, yellow, red and green respectively. An arrow points
to the position of the active site cavity. The Cα atoms of each
structure were used in the alignment and all structures were
aligned to the SARS 3CLpro structure. The RMSD values from the
alignment were 0.889 Å for HCoV 229E 3CLpro (over 247 Cα), 1.302 Å
for IBV 3CLpro (over 247 Cα), and 1.8 Å for TGEV 3CLpro (over
272 Cα).


Another region with a high degree of conservation consists of the
residues determining substrate specificity in the S1, S2 and S4
subsites of the active site cavity of 3CLpro [(19) and unpublished
work from our laboratory]. The catalytic Cys and His residues are
absolutely conserved in the main proteases of all CoVs, and the
high similarity of these subsites in 3CLpro among members of the
three coronaviral families is also well established (3,19,21).
Substrate specificity studies have shown high conservation in the
residue preference at the corresponding site of the substrate as
well (22). Moreover, substrate sequences derived from the
N-terminal autocleavage site of SARS 3CLpro have been shown to be
processed by proteases from other coronaviruses with an efficiency
equal to that of the SARS-CoV protease (19). Therefore, the design
of broad-spectrum inhibitors against coronaviral main proteases
based on substrate mimetics appears to be a feasible strategy for
drug development.
Recently, substrate mimetics with a trifluoromethyl ketone warhead
that specifically targeted the active site of SARS 3CLpro were
identified (23). In an attempt to optimize those leads, the
affinity of a library of halomethyl ketone compounds with various
P1 substitutions towards 3CLpro was evaluated. Here we show the
results of the characterization of the five best compounds from
this screen. The data presented include the determination of the
binding mechanism, the thermodynamic dissection of the process and
the selectivity against a panel of other proteases. Our results
indicate that a 1000-fold improvement in affinity towards 3CLpro
can be achieved by modifying the halogenation of this warhead and
the substitution at the P1 position, and by reducing the compound
size. Of particular interest is Compound 4, which inhibits the
protease by forming an initial reversible complex followed by a
much slower irreversible reaction between Cys 145 and the adjacent
halomethyl group, resulting in a thioether linkage. The
crystallographic structure of 3CLpro in complex with Compound 4
indicates novel Pn-Sn interactions. An accurate pharmacophore model
has been derived from the affinity (Ki) profiles of the compounds
studied in this work. Experimental validation of the predictive
ability of this model, performed using a commercially available
compound library, indicates that the pharmacophore has an
effectiveness of 95% in selecting molecules with activity better
than 1000 µM against SARS 3CLpro.


Materials and Methods 


Protein purification 

Recombinant SARS 3CLpro was expressed as a soluble fraction in
BL21 Star DE3 E. coli competent cells (Invitrogen, Carlsbad, CA,
USA). The construct begins with residue Ser1, and therefore does
not contain the full N-terminal auto-cleavage site of the protein.
Cells were grown in LB supplemented with ampicillin (50 µg/mL) at
37 °C, induced with 1 mM IPTG when the optical density (as
determined by absorbance at 600 nm) was 0.8 or greater, and
harvested after 4 h. Cells were re-suspended in lysis buffer (50 mM
potassium phosphate (pH 7.8), 400 mM NaCl, 100 mM KCl, 10%
glycerol, 0.5% Triton-X, and 10 mM imidazole). The cells were
broken by sonication on ice using pulses of 1 second on followed by
3 seconds off for a total of 16 min. Cell debris was removed by
centrifugation (20 000 g at 4 °C for 45 min). The supernatant was
filtered using a 0.45 µm pore size filter (Millipore, Billerica,
MA, USA) and applied directly to a nickel affinity column (HiTrap
Chelating HP, Amersham Biosciences, Piscataway, NJ, USA) that had
been pre-equilibrated with binding buffer (50 mM sodium phosphate,
0.3 M NaCl, 10 mM imidazole, pH 8.0). The protease was eluted with
a linear gradient of 50 mM sodium phosphate, 300 mM NaCl, 250 mM
imidazole, pH 8.0. After elution, the protein was buffer exchanged
into 10 mM Tris-HCl pH 7.5, and loaded onto a Q-sepharose anion
exchange column (Amersham Biosciences). The protease was eluted
with a gradient of 10 mM Tris-HCl, 1 M NaCl, pH 7.5. The pooled
fractions containing 3CLpro were exchanged into storage buffer
(10 mM sodium phosphate, 10 mM NaCl, 1 mM Tris[2-carboxyethyl]
phosphine (TCEP), 1 mM EDTA, pH 7.4) and digested for 48 h at 4 °C
with enterokinase (Invitrogen, 0.1 units per 112 µg of protease) to
remove the N-terminal polyhistidine tag. The enterokinase was
removed by incubation with EK-away resin (Invitrogen). The reaction
mixture was passed through a nickel affinity column to remove
undigested protease. The protease was exchanged into storage
buffer, concentrated to 10 mg/mL and used immediately for
experiments. The sample was more than 95% pure, as assessed by
SDS-PAGE.


Kinetic assay 

The activity of the SARS protease 3CLpro was determined by
continuous kinetic assays using the fluorogenic substrate
Dabcyl-Lys-Thr-Ser-Ala-Val-Leu-Gln-Ser-Gly-Phe-Arg-Lys-Met-Glu-Edans
(Genesis Biotech, Taipei, Taiwan). The sequence of the peptide,
which was derived from the N-terminal auto-cleavage site of the
protease, is flanked by the fluorescent groups Dabcyl and Edans
(24). The increase in fluorescence intensity upon substrate cleavage
was monitored in a Cary Eclipse fluorescence spectrophotometer
(Varian) using wavelengths of 355 and 538 nm for the excitation
and emission, respectively. The experiments were performed in a
buffer containing 10 mM sodium phosphate, pH 7.4, 10 mM NaCl,
1 mM TCEP, and 1 mM EDTA. The enzyme activity parameters, Km and
kcat, were determined by initial rate measurements of substrate
cleavage at 25 °C in 2% dimethyl sulfoxide (DMSO). The reaction
was initiated by adding protease (final concentration 250 nM) to a
solution of substrate at a final concentration of 0-80 µM, to a
total volume of 120 µL in a microcuvette.
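
For orientation, the initial-rate analysis described above is usually read through the Michaelis-Menten relation v = kcat[E][S]/(Km + [S]). The short sketch below evaluates that relation over the substrate range used in the assay. Only the enzyme concentration (250 nM) and the substrate range (0-80 µM) come from the text; the kcat and Km values are hypothetical placeholders chosen for illustration, not the fitted parameters of this study.

```python
def initial_rate(s_um, e_um, kcat_per_s, km_um):
    """Michaelis-Menten initial velocity in µM/s for a substrate concentration in µM."""
    return kcat_per_s * e_um * s_um / (km_um + s_um)


# Conditions taken from the text: 250 nM enzyme, substrate between 0 and 80 µM.
ENZYME_UM = 0.25
SUBSTRATE_UM = [0.0, 5.0, 10.0, 20.0, 40.0, 80.0]

# Hypothetical kinetic parameters (illustration only, not reported values).
KCAT_PER_S = 1.0
KM_UM = 50.0

rates = [initial_rate(s, ENZYME_UM, KCAT_PER_S, KM_UM) for s in SUBSTRATE_UM]

for s, v in zip(SUBSTRATE_UM, rates):
    print(f"[S] = {s:5.1f} µM   v = {v:.4f} µM/s")
```

With these placeholder values the rate reaches half of its maximum (kcat[E]) when [S] equals Km, which is the property a fit to the measured initial rates exploits.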


Inhibition assay 

Compounds 1, 2, 3, and 5 were purchased from Bachem (Bachem
Corporation, USA) and Compound 4 was purchased from Fluka
(Sigma-Aldrich Corporation, St Louis, MO, USA). Inhibition assays
were performed under the same conditions as described in the
Kinetic assay section, with increasing concentrations of substrate
(5-20 µM) in the presence of inhibitor (0-1.5 mM). The data were
fitted to the first-order rate exponential equation:


$$[P] = \frac{v_0}{k_{\mathrm{app}}}\left(1 - e^{-k_{\mathrm{app}} t}\right) + D \tag{1}$$


where [P] is the product fluorescence, v0 is the initial velocity
of substrate cleavage, D is the displacement term that accounts
for the fact that emission is not zero at the start of the assay
measurement, kapp is the apparent rate constant of the reaction and
t is the time in seconds. The kapp obtained was plotted as a
function of [I] according to the linear relationship:


$$k_{\mathrm{app}} = \frac{k_3\,[I]}{K_i\left(1 + \dfrac{[S]}{K_m}\right)} \tag{2}$$


where k3 is the inactivation rate constant, Ki is the equilibrium
inhibition constant, Km is the Michaelis constant and [S] is the
substrate concentration (19,25). Data from these continuous assays
were analysed using the non-linear regression analysis software
Origin.
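
The two-step analysis above (a progress curve per inhibitor concentration, then kapp against [I]) can be sketched in a few lines. This is a minimal illustration rather than the fitting procedure used in the study, which was carried out in Origin. The function names are hypothetical, the numerical inputs are arbitrary, and the second function assumes the linear form of eqn 2 described in the text, kapp = k3[I]/(Ki(1 + [S]/Km)).

```python
import math


def product_signal(t, v0, k_app, d):
    """Eqn 1: [P] = (v0 / k_app) * (1 - exp(-k_app * t)) + D."""
    return (v0 / k_app) * (1.0 - math.exp(-k_app * t)) + d


def k_app_linear(inhibitor, k3, ki, substrate, km):
    """Assumed linear form of eqn 2: k_app = k3 * [I] / (Ki * (1 + [S]/Km)).

    inhibitor and ki share one concentration unit; substrate and km share another.
    """
    return k3 * inhibitor / (ki * (1.0 + substrate / km))


# Arbitrary illustrative inputs (not measured values).
V0, K_APP, OFFSET = 2.0, 0.01, 5.0

signal_at_start = product_signal(0.0, V0, K_APP, OFFSET)    # equals the offset D
signal_plateau = product_signal(1.0e6, V0, K_APP, OFFSET)   # approaches v0/k_app + D

k_low = k_app_linear(1.0, 0.015, 0.3, 10.0, 50.0)
k_high = k_app_linear(2.0, 0.015, 0.3, 10.0, 50.0)          # doubles with [I]

print(signal_at_start, signal_plateau, k_low, k_high)
```

The limiting values show what each fitted parameter controls: D sets the starting signal, v0/kapp sets the amplitude of the curve, and under the assumed linear form the slope of kapp against [I] equals k3/(Ki(1 + [S]/Km)).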


Isothermal titration calorimetry 
Isothermal titration calorimetry experiments were carried out using
a high precision VP-ITC titration calorimetric system (Microcal
Inc., Northampton, MA, USA). The enzyme solution in the
calorimetric cell was titrated with inhibitor solutions dissolved
in the same buffer (10 mM sodium phosphate, 10 mM NaCl, 1 mM TCEP,
1 mM EDTA, pH 7.4) with a 2% final DMSO concentration at 25 °C. The
heat evolved after each ligand injection was obtained from the
integral of the calorimetric signal. In order to compensate for the
delayed kon of the inhibitor, injections were spaced 1000 seconds
apart. The heat due to the binding reaction between the inhibitor
and the enzyme was obtained as the difference between the heat of
reaction and the corresponding heat of dilution.


Trypsin inhibition assay 

Selectivity of the compounds was determined by measuring their
ability to inhibit commercially available bovine pancreatic trypsin
(Sigma). Trypsin (final concentration 359 nM) and compound
(0-100 µM) were pre-incubated for 10 min prior to the start of the
assay. The reaction was initiated by the addition of the
chromogenic substrate, Nα-benzoyl-L-arginine ethyl ester
hydrochloride (Sigma), at a final concentration of 200 µM. The
change in absorbance at 253 nm was detected using a Cary
spectrophotometer (Varian). The experiments were performed in 50 mM
sodium phosphate, pH 7.0, 5% DMSO at 25 °C.


Thrombin inhibition assay 

Inhibition of thrombin by the compounds was evaluated under
experimental conditions similar to those used for trypsin. Human
thrombin (Sigma) was pre-incubated with compound for 10 min prior
to the start of assay measurements. The final thrombin
concentration was 25 nM and the compound concentration varied from
0 to 100 µM. The reaction was initiated by the addition of the
chromogenic substrate Sar-Pro-Arg p-nitroanilide dihydrochloride
(Sigma) at a final concentration of 208 µM. The change in
absorbance of the substrate upon cleavage by thrombin was monitored
at 405 nm over time using a Cary spectrophotometer (Varian).
Experiments were conducted in 50 mM Tris, pH 7.4, 100 mM NaCl, and
5% DMSO at 25 °C.


Calpain inhibition assay 

Inhibition of purified calpain (BioVision, San Francisco, CA, USA)
was measured using the fluorescent substrate Ac-Leu-Leu-Tyr-AFC.
The reaction was initiated by the addition of 0.5 µL of enzyme to a
mixture containing the reaction and extraction buffers provided by
BioVision and 5 µL of substrate, in the presence of 2% DMSO at
37 °C. A calpain inhibitor, Z-LLY-FMK (provided by the
manufacturer), was used at a final reaction concentration of 100 nM
as a standard to gauge the inhibition by the compounds in this
study. The increase in fluorescence upon substrate cleavage was
monitored in a Cary Eclipse fluorescence spectrophotometer (Varian)
using wavelengths of 400 and 505 nm for the excitation and
emission, respectively.


Crystallization 

Co-crystals of SARS 3CLpro with Compound 4 were obtained by adding
98 µL SARS 3CLpro (6.5 mg/mL) in 10 mM Tris-HCl, 0.1 M NaCl, 1 mM
EDTA, 1 mM TCEP pH 7.4 to 2 µL Compound 4 (100 mM) dissolved in
100% DMSO. The final concentration of the inhibitor was 2 mM. The
mixture was incubated at room temperature for 15 min to allow the
protein and inhibitor to interact, and was then centrifuged for
5 min at 14 000 x g to remove any aggregates that had formed. A
condition used to crystallize wild-type SARS 3CLpro was used as a
starting condition (18). The best crystals grew in hanging-drop
experiments with a 500 µL reservoir solution containing 0.7 M
sodium malonate (pH 7.0) and 3-5% isopropanol. The drop was made
using 2 µL of reservoir solution and 2 µL of the protein/inhibitor
solution. Crystals appeared after 3 months at room temperature.
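
The concentrations in the protein/inhibitor mixture follow from simple dilution arithmetic (C1V1 = C2V2). The sketch below reproduces the stated final inhibitor concentration of 2 mM from the volumes given in this section. The DMSO percentage and the diluted protein concentration it prints are derived values that are not stated in the text, and the helper name is hypothetical.

```python
def dilute(stock_conc, stock_volume, total_volume):
    """Concentration after bringing stock_volume of a stock to total_volume."""
    return stock_conc * stock_volume / total_volume


PROTEIN_UL = 98.0     # SARS 3CLpro solution at 6.5 mg/mL
INHIBITOR_UL = 2.0    # Compound 4 at 100 mM in 100% DMSO
TOTAL_UL = PROTEIN_UL + INHIBITOR_UL

inhibitor_mm = dilute(100.0, INHIBITOR_UL, TOTAL_UL)   # mM, matches the stated 2 mM
dmso_percent = dilute(100.0, INHIBITOR_UL, TOTAL_UL)   # % v/v, derived
protein_mg_ml = dilute(6.5, PROTEIN_UL, TOTAL_UL)      # mg/mL, derived

print(f"inhibitor: {inhibitor_mm:.2f} mM")
print(f"DMSO:      {dmso_percent:.2f} %")
print(f"protein:   {protein_mg_ml:.2f} mg/mL")
```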


Data collection 

SARS 3CLpro-Compound 4 co-crystals belong to space group P2₁2₁2
with cell dimensions a = 106.66 Å, b = 45.16 Å, and c = 53.96 Å.
Data were collected from a crystal flash-frozen using 10% glycerol
as cryoprotectant at beam line X6a at the National Synchrotron
Light Source, Brookhaven National Laboratory. Intensity data were
integrated, scaled, and reduced to structure factor amplitudes with
the HKL2000 suite (26), as summarized in Table 3.


Structure determination and refinement 

The structure of the co-crystal of SARS 3CLpro and Compound 4 was
determined by molecular replacement with the program AMoRe (27)
using wild-type SARS 3CLpro (PDB ID 2BX4) as the search model (28).
Restrained refinement of the models was performed using REFMAC
(29). Manual building was carried out using O (30) and water
molecules were placed with ARP/wARP (31). The stereochemistry of
the model was checked and analysed using PROCHECK and MolProbity
(32).


Coordinates 

The coordinates for the structure of SARS 3CLpro bound to Compound
4 have been deposited in the Protein Data Bank (accession number
3D62).


Pharmacophore generation 

Pharmacophore models were generated using the Pharmacophore
Application module in MOE 2006.0804 (Quebec, Canada). Because the
model is binary, a threshold of 1000 µM was selected based on the
distribution of Ki data of the compounds in our database. Compounds
with a Ki below the threshold were classified as active and those
with a Ki above the threshold as inactive. A low energy
multi-conformational database of all the compounds in the library
was generated using the MMFF94x force field, with a cutoff on the
strain energy of <4 kcal/mol. The pharmacophore annotation scheme
PPCH_ALL, provided by MOE, was used to calculate the planar, polar,
charged, and hydrophobic features, including all hydrophobes, for
the conformation library. The model was optimized in a training set
database of 22 compounds, with 15 active compounds (Ki < 1000 µM)
and seven inactive compounds (Ki > 1000 µM). The structure and
activity of all 22 compounds are shown in Table S1 of the
Supplementary Material. Based on this breakdown, an ideal model
would select only
the active compounds (positive controls) from the training database
and not the inactive compounds (negative controls), thereby
providing an ideal enrichment factor (R_ideal) of 1.5 based on
eqn. 3. Flexible alignment of the lowest energy conformation of the
most active compounds led to the identification of their critical
ligand features. The basic level of this model was the arrangement
of the annotation features in a three-dimensional array as shown in
Figure 6A. A second level of complexity was then added to this
model by defining exclusion regions; compounds with features
protruding into these areas were excluded from the model. The final
selection criterion was the placement of an external shell which
defined the maximum conformational space that can be sampled by
molecules. By gradual refinement, the ability of the model to
discriminate between active and inactive compounds was improved.
This was gauged by the calculation of an observed enrichment factor
(R_observed) using eqn. 3 for the training models, and the
pharmacophores were optimized until R_observed = R_ideal. The final
model effectiveness was calculated using eqn. 4.


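$$R = \frac{\left(\dfrac{\text{Active Hits}}{\text{Total Hits}}\right)}{\left(\dfrac{\text{Total Active}}{\text{Total Database}}\right)} \tag{3}$$

$$\text{Model Effectiveness} = \frac{R_{\text{observed}}}{R_{\text{ideal}}} \times 100 \tag{4}$$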
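
As a worked check of eqns 3 and 4, the sketch below computes the ideal enrichment factor for the training set described above (15 active compounds in a database of 22), which gives 22/15, or roughly the value of 1.5 quoted in the text. The intermediate model that returns 14 actives and 2 inactives is a hypothetical example added only to show how the effectiveness responds to imperfect selection; the function names are likewise illustrative.

```python
def enrichment_factor(active_hits, total_hits, total_active, total_database):
    """Eqn 3: fraction of actives among the hits over fraction of actives in the database."""
    return (active_hits / total_hits) / (total_active / total_database)


def model_effectiveness(r_observed, r_ideal):
    """Eqn 4: observed enrichment as a percentage of the ideal enrichment."""
    return 100.0 * r_observed / r_ideal


# Training set from the text: 22 compounds, 15 active and 7 inactive.
TOTAL_ACTIVE, TOTAL_DATABASE = 15, 22

# Ideal model: selects all 15 actives and no inactives.
r_ideal = enrichment_factor(15, 15, TOTAL_ACTIVE, TOTAL_DATABASE)

# Hypothetical intermediate model: 14 actives plus 2 inactives among 16 hits.
r_observed = enrichment_factor(14, 16, TOTAL_ACTIVE, TOTAL_DATABASE)

print(f"R_ideal       = {r_ideal:.3f}")
print(f"R_observed    = {r_observed:.3f}")
print(f"effectiveness = {model_effectiveness(r_observed, r_ideal):.1f} %")
```

Because the database composition appears in both enrichment factors, it cancels in eqn 4, so the effectiveness depends only on the fraction of hits that are truly active.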


Virtual screening 

The pharmacophore model was used as a template for virtual
screening of commercially available databases provided by MOE
2006.0804 containing a total of 1 000 000 compounds. The screening
was carried out using MOE and resulted in 40 hits (18). These 40
compounds were purchased and tested for inhibition of SARS 3CLpro
in the fluorogenic assay described above. Of the 40 compounds
tested, 38 had a Ki < 1000 µM; the remaining two were false
positives with a Ki > 1000 µM. Based on the actives and false
positives that were selected by the model, R_observed, R_ideal and
the effectiveness were calculated using eqns 3 and 4, respectively.
Product 1 was purchased from ChemDiv (ChemDiv, Inc., San Diego, CA,
USA), Product 2 was purchased from Sigma (Sigma-Aldrich Corp.,
USA), Product 3 was purchased from ChemBridge (ChemBridge Corp.,
USA), Product 4 was purchased from the Florida Center for
Heterocyclic Compounds (University of Florida, USA), and Product 5
was purchased from Interchim (Montluçon, France).
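
One way to see how the screening outcome maps onto the effectiveness defined in eqn 4 is the short calculation below. It assumes that R_ideal for the screen corresponds to the case in which all 40 hits are active; A (the number of actives in the database) and N (the database size) are symbols introduced here for illustration, and A is not known from the text.

$$
\text{Model Effectiveness}
= \frac{R_{\text{observed}}}{R_{\text{ideal}}} \times 100
= \frac{(38/40)\,/\,(A/N)}{(40/40)\,/\,(A/N)} \times 100
= \frac{38}{40} \times 100 = 95\%
$$

Under that assumption the unknown ratio A/N cancels, so the effectiveness equals the fraction of purchased hits that were confirmed as active.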


Results and Discussion 


Catalytic mechanism and lead generation 

Cysteine proteases, like serine proteases, employ a mechanism
involving the formation of an acyl-enzyme intermediate that is
hydrolysed via the formation of a tetrahedral adduct (33,34).
According to this mechanism, the thiol group of the catalytic
Cys 145 in SARS 3CLpro initiates a nucleophilic attack on the
carbonyl flanking the scissile peptide bond of the substrate, with
the imidazole ring of the His 41 side chain acting as a general
base. The proton abstracted by His 41 is later donated to the
leaving group of the tetrahedral intermediate (35,36).
Substrate specificity studies of SARS 3CLpro have indicated that
the S1 subsite shows preference for Gln at the P1 site of the
substrate. The large S2 subsite, on the other hand, can
accommodate Leu, Ile, Phe, Val and Met in the P2 position. The S3
subsite is not conserved among coronaviruses and the P3 residue at
this site is generally solvent exposed and therefore not well
defined. The S4 subsite favours a hydrophobic side chain to fit
into this cavity, a position that is usually occupied by Ala in the
native substrate (18,34). Based upon this information, we
previously generated a library of substrate mimetics linked to the
trifluoromethyl ketone warhead that showed moderate affinities
towards 3CLpro (23). The best compound (KNI-30001, shown in
Figure 2) had Glu at P1, Leu at P2, and Val at P3 and was
characterized by a Ki of 116 µM.


Structure-based optimization of compounds 

In order to optimize KNI-30001, the effect of each of the
components of the scaffold on the overall affinity towards 3CLpro
was evaluated. Three specific aspects of the scaffold were
emphasized: (i) the halogenation of the warhead, (ii) the compound
size, and (iii) the substitution at the P1 position of the
scaffold. Monopeptide, dipeptide and tripeptide mimetics with
modifications in both the halogen content of the warhead and the P1
position were selected and screened against 3CLpro using an in
vitro kinetic assay. The results for the best five compounds from
this screen are shown in Table 1 along with the general scaffold
used for optimization. Compounds 1-3 have the same monochloromethyl
ketone warhead with alterations at the R2 site on the scaffold
(corresponding to the P1 position). Compounds 1, 2 and 3 have a
Phe, a naphthalene and a p-fluorophenyl derivative at the P1
position, respectively. Compound 4 has a monobromomethyl ketone
warhead and an aliphatic substitution at the P1 position. Compound
5 is a dipeptide with a formic acid methyl ester at the P1 position
and Val at the R3 site (corresponding to the P2 position).


Inhibition results of the compounds 
The general reaction scheme used to analyse the inhibition kinetics 
is shown below in Scheme 1: 


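$$E + I \underset{k_2}{\overset{k_1}{\rightleftharpoons}} E\text{-}I \xrightarrow{k_3} E^{*}I \qquad \text{(Scheme 1)}$$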


Here E-I indicates a reversible enzyme (E) and inhibitor (I)
complex, which subsequently undergoes an inactivation step to form
the irreversible E*I complex. The Ki for the reversible step is
measured as the ratio k2/k1, and k3 is the rate constant of the
rate-limiting inactivation step. A similar scheme was used earlier
to measure the inhibition of other cysteine proteases by
halomethyl ketones (37).
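
The link between Scheme 1 and the dependence of kapp on inhibitor concentration can be made explicit with a standard rapid-equilibrium argument. The sketch below is a textbook derivation offered under stated assumptions, not a derivation taken from this study: the reversible steps are taken to equilibrate much faster than the inactivation step, the substrate is taken to compete with the inhibitor for the free enzyme, and Km is used as an approximation of the substrate dissociation constant.

$$
K_i = \frac{k_2}{k_1} = \frac{[E][I]}{[E\text{-}I]}
$$

$$
k_{\mathrm{app}} = k_3\,\frac{[E\text{-}I]}{[E] + [E\text{-}S] + [E\text{-}I]}
= \frac{k_3\,[I]}{K_i\left(1 + \dfrac{[S]}{K_m}\right) + [I]}
\;\approx\; \frac{k_3\,[I]}{K_i\left(1 + \dfrac{[S]}{K_m}\right)}
\quad \text{for } [I] \ll K_i\left(1 + \frac{[S]}{K_m}\right)
$$

In the low-inhibitor limit kapp is linear in [I], with a slope of k3 divided by the substrate-corrected Ki.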


A possible mechanism for the inactivation step (Scheme 2) is
thought to be initiated by attack of the thiolate-imidazolium ion
pair at the active site on the warhead carbonyl, leading to the
formation of a thiohemiketal complex which subsequently undergoes
an alkylation reaction to form the irreversible product (38). The
halomethyl
Figure 2: The chemical structure of KNI-30001. The structures
were generated using ChemDraw Ultra 6.0 (Cambridge Software).
The compound consists of a trifluoromethyl ketone warhead with a
Glu in the P1 position, Leu at P2 and Val at P3.
Table 1: Kinetic inhibition data for Compounds 1-5

[Scaffold drawing: core structure bearing the substituents R1, R2
and R3. The R2 drawings are represented below by the substituent
names used in the text.]

Compound  R1  R2 (P1 position)           R3       Ki (nM)    k3 (×10⁻²/second)
1         Cl  Phe                        Cbz      306 ± 10   1.5 ± 0.1
2         Cl  naphthalene                Cbz      371 ± 15   2.8 ± 0.5
3         Cl  p-fluorophenyl derivative  Cbz      380 ± 31   1.8 ± 0.7
4         Br  aliphatic                  Cbz      400 ± 71   <<0.005
5         F   formic acid methyl ester   Cbz-Val  512 ± 25   1.6 ± 0.1

Cbz = benzyloxycarbonyl.
The scaffold shown defines the core structure of the compounds.
group (A in Scheme 2) makes the adjacent ketone group more
susceptible to nucleophilic attack. In the presence of a thiol
group, the warhead attains the tetrahedral conformation that
resembles the enzyme-substrate intermediate formed during substrate
catalysis (B in Scheme 2). Over time, intramolecular rearrangement
leads to alkylation at the adjacent carbon of the halomethyl group,
giving the final structure (C in Scheme 2). Another possibility is
the attack of the thiolate ion on the halomethyl carbon adjacent to
the warhead carbonyl, which then leads to the alkylated product.


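[Scheme 2: Possible mechanism of the inactivation step, showing the
halomethyl ketone warhead (A), the tetrahedral thiohemiketal formed
with Cys 145 (B), and the alkylated product (C).]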


Since Compounds 1-3 have similar warheads, the differences in 
their activity can be attributed only to the changes in the P1 
Figure 3: Calorimetric titration of SARS 3CLpro with Compound 4.
In this experiment, each peak represents the injection of 10 µL
of 3CLpro (125 µM) into the calorimetric cell (1.4272 mL)
containing Compound 4 at a concentration of 8 µM. The experiment
was performed at 25 °C in buffer containing 10 mM sodium phosphate,
10 mM NaCl, 1 mM TCEP, 1 mM EDTA, pH 7.4, with a 2% final DMSO
concentration.
Table 2: The selectivity of Compounds 1-5 as measured against
trypsin, thrombin, and calpain

(A)
Compound  Ki against    Ki against    Ki against     Ki against
          3CLpro (nM)   trypsin (µM)  thrombin (µM)  calpain (µM)
1         306           >100          >100           10
2         371           >100          75             9
3         380           >200          150            8
4         400           >200          >200           15
5         512           >200          >200           20

(B)
Compound  Selectivity       Selectivity        Selectivity
          against trypsin   against thrombin   against calpain
1         >300              >300               32
2         >300              202                24
3         >500              394                21
4         >500              >500               38
5         >300              >400               39

(A) The measured affinities (Ki) of Compounds 1-5 for 3CLpro,
trypsin, thrombin and calpain. (B) Selectivity of Compounds 1-5
with respect to trypsin, thrombin and calpain.
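
The selectivity values in panel B are consistent with the ratio of the Ki against the off-target protease to the Ki against 3CLpro, after converting both to the same unit. The sketch below applies that ratio to a few entries of panel A. The ratio definition is an inference from the tabulated numbers rather than a formula stated in the text, and the function name is hypothetical.

```python
# Ki against 3CLpro in nM, from panel A of Table 2.
KI_3CLPRO_NM = {1: 306.0, 2: 371.0, 3: 380.0, 4: 400.0, 5: 512.0}


def selectivity(ki_off_target_um, ki_3clpro_nm):
    """Ratio Ki(off-target) / Ki(3CLpro), with the off-target Ki converted from µM to nM."""
    return ki_off_target_um * 1000.0 / ki_3clpro_nm


thrombin_cpd2 = selectivity(75.0, KI_3CLPRO_NM[2])    # Table 2B lists 202
calpain_cpd1 = selectivity(10.0, KI_3CLPRO_NM[1])     # Table 2B lists 32
calpain_cpd5 = selectivity(20.0, KI_3CLPRO_NM[5])     # Table 2B lists 39

print(f"Compound 2 vs thrombin: {thrombin_cpd2:.1f}")
print(f"Compound 1 vs calpain:  {calpain_cpd1:.1f}")
print(f"Compound 5 vs calpain:  {calpain_cpd5:.1f}")
```

Entries reported as lower bounds (for example >100 µM against trypsin) yield lower bounds on the selectivity in the same way.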


position. The compound with the highest inhibitory potency,
Compound 1, had a Ki of 306 ± 10 nM as shown in Table 1. The
naphthalene substitution at the P1 position of Compound 2 reduced
the potency to a Ki of 371 ± 15 nM. The addition of fluorine at the
para position of the phenyl ring at P1 did not change the potency
of Compound 3 (Ki = 380 ± 31 nM). Although Gln is traditionally
present at this position, a hydrophobic moiety is also highly
tolerated. This tolerance is consistent with previous studies where
the modifications to the P1 position included a lactam ring in the
S stereochemistry (19), keto-glutamine analogs with a phenyl group
at P1 (39) and an α,β-unsaturated ester (40), all of which showed a
stark improvement in the inhibitory potency of the compounds
towards the protease. Compound 4, with a bromomethyl ketone warhead
and an aliphatic substitution at the P1 position, had a Ki value of
400 ± 71 nM and may also bind in a similar conformation. Compound
5, with a monofluoromethyl ketone warhead, had a Ki value of
512 ± 25 nM. Altering the methyl ester at the P1 position of this
compound into a carboxylic acid rendered it completely inactive,
with a Ki > 1000 µM (Table S1 in Supplementary Material). This
indicated that the larger footprint occupied by the dipeptidic
compounds leads to an altered conformation of the compound when
compared to monopeptidic compounds. The P1 substitution in this
orientation fits into a hydrophobic pocket that is more sensitive
to structural changes. This hydrophobic moiety provides additional
van der Waals interactions at this site that improve the affinity
of the ligands.


Previous reports have shown that halomethylketone compounds 
react with thiols to form thioethers (37). It is also well documented 
that these warheads form methyl phosphonium salts with reducing 
agents such as phosphines. Phosphinomethyl ketone compounds 
were previously shown to inhibit cysteine and serine proteases (41). 
This raised the possibility that the compounds presented in this 
study may be interacting with the protease due to the reaction with 
tris(2-carboxyethyl)phosphine (TCEP) present in the buffer. However, 
no loss in inhibition was observed in kinetic measurements of Com- 
pounds 1-5 performed in the absence of TCEP. This result indicates 
that the active species in the inhibition of the protease is the 
halomethyl ketone and not the phosphinomethyl ketone. 


Altering the halogen substitution of the warhead also had a
substantial effect on the reactivity of the inhibitors. Substrate
analogues with chloromethyl ketones have been shown to inhibit
Cathepsin B and Papain, whereas fluoromethyl ketones have shown
activity against Caspases, Calpains and Cathepsin B as well
(37,42-44). Furthermore, NMR experiments have indicated that
fluoromethyl and chloromethyl ketones are able to activate the
carbonyl carbon of the ketone, facilitating the formation of a
thiohemiketal at the active site of cysteine proteases (37). A high
rate of inactivation (k3) is related to the ability of the warhead
to accept a nucleophilic attack by the thiol side chain leading to
the eventual alkylation. The inactivation constant k3 for Compound
2 (2.8 ± 0.5 × 10⁻²/second) was almost twice that of Compound 1
(1.5 ± 0.1 × 10⁻²/second) and Compound 3 (1.8 ± 0.7 × 10⁻²/second)
(Table 1). The larger P1 moiety of Compound 2 may orient the
warhead in a conformation more favourable for reacting with the
thiol side chain of Cys 145. The inactivation constant of the
dipeptide Compound 5 was 1.6 ± 0.1 × 10⁻²/second, which was similar
to those of Compounds 1 and 3. The k3 of Compound 4 was too small
to be measured accurately in the kinetic assay
(<0.005 × 10⁻²/second), indicating that the irreversible step is
very slow and that for several hours the compound behaves as a
reversible inhibitor. In fact, the activity of the


Table 3: Crystallographic data collection and refinement statistics

Data collection
  Space group                P2₁2₁2
  Cell dimensions
    a, b, c (Å)              106.66, 45.16, 53.96
  Resolution (Å)             50.00-2.70 (2.80-2.70)*
  Rsym or Rmerge             0.077 (0.505)
  I/σI                       29.3 (3.9)
  Completeness (%)           98.8 (98.4)
  Redundancy                 5.0 (5.7)
  X-ray source               X6a
  Wavelength                 0.9537
Refinement
  Resolution (Å)             50.0-2.70
  No. reflections            7138
  Rwork/Rfree                0.259/0.362 (0.299/0.497)
  Number of atoms
    Protein                  2350
    Ligand/ion               5
    Water                    53
  B-factors
    Protein                  36.67
    Ligand/ion               49.22
    Water                    48.03
  RMS deviations
    Bond lengths (Å)         0.009
    Bond angles (°)          1.12

*Parentheses indicate values for the highest resolution shell.
Table 4: Pharmacophore validation*

Compound   Structure            MW      Ki (µM)
Product 1  [structure drawing]  241.67  45 ± 1
Product 2  [structure drawing]  409.01  13 ± 3
Product 3  [structure drawing]  404.34  27 ± 2
Product 4  [structure drawing]  449.14  23 ± 4
Product 5  [structure drawing]  323.29  34.5 ± 9

*The pharmacophore was screened through 1 000 000 compounds to
result in 40 hits. Only the Ki values of the best compounds from
the hits are shown.

protease was completely recovered after incubation in 20 µM Compound 4
(50 times the Ki) for 10 min. Had the inhibition been completely
irreversible, the protease would have been inactivated at such a high
inhibitor concentration. For a reversible interaction, however,
activity can be recovered when the inhibitor is diluted. Irreversible
loss of enzyme activity was only observed after incubation times
exceeding 12 hours at a high concentration of Compound 4 (data not shown).
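
The time scales quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes saturating inhibitor and simple first-order loss of active enzyme, and it takes the reported detection limit for Compound 4 (about 5 × 10^-5 per second) as an upper bound on k3. The function name and the chosen time points are not from the original work.

```python
import math

# Assumed upper bound on k3 for Compound 4, in 1/second. This is the
# reported detection limit; the true value may be smaller.
K3_UPPER = 0.005e-2


def fraction_inactivated(k3_per_s, minutes):
    """Fraction of enzyme alkylated after `minutes`.

    Assumes saturating inhibitor, so active enzyme decays as exp(-k3 * t).
    """
    return 1.0 - math.exp(-k3_per_s * minutes * 60.0)


# Time constant 1/k3 expressed in minutes (a lower bound, since k3 is an upper bound).
time_constant_min = 1.0 / K3_UPPER / 60.0
print(f"1/k3 >= {time_constant_min:.0f} min")

# Upper bounds on the alkylated fraction at 10 min, 6 h and 12 h.
for minutes in (10, 360, 720):
    print(minutes, "min:", round(fraction_inactivated(K3_UPPER, minutes), 3))
```

Under these assumptions at most a few percent of the enzyme is alkylated within 10 min, while most of it can be modified after 12 h, which is in line with the recovery and long-incubation observations described above.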


This reversible interaction of the bromomethyl ketone warhead of
Compound 4 was somewhat unexpected, as bromide is generally a better
leaving group, which makes bromine derivatives more reactive than


Chem Biol Drug Des 2008; 72: 34-49 


Novel Inhibitors of SARS Protease 3CLpro


Figure 4: Location of the Compound 4 binding site. (Panel A) Crystal structure of SARS 3CLpro bound to Compound 4, showing the location
of the ligand within the SARS protease monomer. The protease is depicted in ribbon representation with the catalytic residues His41 and
Cys145 shown as yellow sticks. The ligand is shown in stick representation and is coloured by atom type with oxygen in red, nitrogen in blue,
and carbon in green. (Panel B) Key residues involved in the protein-ligand interaction. The backbone of the protein is shown as blue ribbon.
The side chains of the protein and ligand are coloured by atom type with oxygen in red, nitrogen in blue, sulfur in tan, and carbon in either
green or yellow for the ligand and protein, respectively. (Panel C) Orientation of the ligand within the active site cavity of SARS 3CLpro. The
ligand is depicted in stick representation and is coloured by atom type as in (Panel A) and (Panel B). The protein is shown in surface
representation with each of the defined substrate subsites highlighted. S1 is shown in red, S2 in rose, and S4 in blue. Cys145 is depicted in
yellow. Non-subsite residues are shown in grey.


chlorine or fluorine derivatives in SN2 reactions with nucleophiles
such as the thiol of cysteine proteases. The results from this study
point to a much lower reactivity of this warhead, with the formation
of a reversible complex followed by an irreversible alkylation.
Also, the inactivation constants (k3) of the compounds in this study
are lower than those reported for other dipeptidyl halomethyl
ketones against human cathepsin B (37). However, the rate of
inactivation depends on the orientation of the compound
warhead in the active site (37), which is different for 3CLpro.


Isothermal titration calorimetry 

The binding energetics of compounds 1-5 were also determined by
isothermal titration calorimetry (ITC). The calorimetric titrations of
compounds 1-3 and 5 were characterized by very large reaction
heats (-18 kcal/mol), consistent with the formation of a covalent
complex, as expected from the fast irreversible rates (k3) for these
compounds. Compound 4, on the other hand, had a very different
thermodynamic signature, consistent with the observation that the
irreversible step of this compound is extremely slow and
characterized by a time constant (1/k3) larger than 300 min, i.e. within
the calorimeter the binding reaction occurs under equilibrium conditions.
Figure 3 shows the calorimetric titration of Compound 4 into 3CLpro.
The binding of Compound 4 is characterized by a small favourable
binding enthalpy (ΔH = -1.6 kcal/mol) and a favourable entropic
contribution (-TΔS = -6.7 kcal/mol), resulting in an overall Gibbs
energy of -8.3 kcal/mol. The dissociation constant (Kd) determined
calorimetrically amounts to 800 nM, which is close to the Ki value
estimated from the kinetic inhibition data.
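
The relation between the calorimetric terms and the dissociation constant can be illustrated with a brief calculation. This is a minimal sketch that assumes a temperature of 25 °C (the experimental temperature is not stated in this section) and uses the standard relations ΔG = ΔH - TΔS and Kd = exp(ΔG/RT). The variable names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15  # assumed temperature (25 C); not given in the text

delta_h = -1.6          # binding enthalpy, kcal/mol
minus_t_delta_s = -6.7  # entropic contribution -T*dS, kcal/mol

# Gibbs energy of binding is the sum of the two contributions.
delta_g = delta_h + minus_t_delta_s

# Dissociation constant in mol/L, then converted to nM.
kd_molar = math.exp(delta_g / (R_KCAL * T_KELVIN))
kd_nanomolar = kd_molar * 1e9

print(f"dG = {delta_g:.1f} kcal/mol, Kd ~ {kd_nanomolar:.0f} nM")
```

At the assumed temperature the computed Kd falls close to the 800 nM reported from the titration, so the enthalpy, entropy and affinity values quoted above are mutually consistent.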


Inhibitor selectivity 

Selectivity is a measure of the affinity of a compound against its 
intended target versus its affinity against other proteins, especially 
those belonging to the same class. Selectivity is defined as the fol- 
lowing ratio: 


$$\text{Selectivity} = \frac{K_i\ \text{towards unwanted target}}{K_i\ \text{towards intended target}} \qquad (8)$$


Halomethyl ketone warheads have been shown to selectively target
serine and cysteine proteases. Trypsin and thrombin are serine
proteases that play a major role in the digestive system and the
blood-clotting cascade, respectively. The first two domains of 3CLpro
have an antiparallel β-barrel structure reminiscent of the
chymotrypsin fold, which is also observed in picornavirus 3C proteases.
Furthermore, the crystal structure of 3C protease from rhinovirus-14
showed the presence of two topologically equivalent six-stranded
β-barrels that were similar to trypsin-like serine proteases such as
thrombin (45). In order to investigate the selectivity of the


Figure 5: Structure of Compound 4. (Panel A) The chemical
structure of Compound 4 is shown with the atoms of the ligand
observed in the crystal structure coloured in black and the atoms
not observed in red. (Panel B) The 2Fo-Fc electron density map
corresponding to Compound 4 bound to Cys145 is shown, contoured at
1 sigma. The protein and ligand are coloured by atom type with
oxygen in red, nitrogen in blue, sulphur in yellow, and carbon in
either green or blue for the ligand and protein, respectively.


inhibitors, their ability to inhibit serine proteases such as trypsin
and thrombin, as well as a cysteine protease, calpain, was tested.
The results from these experiments are shown in Table 2a and b.
The inhibitors were found to be highly selective towards SARS
3CLpro when compared to the other three proteases. None of the
inhibitors was active against trypsin, even though inhibition
was measured up to the solubility limit of each inhibitor.
Similar results were obtained for inhibition measured
against thrombin. In this case, only the Ki values for Compounds 2
and 3 were within detection limits, with values of 72 ± 20 and
150 ± 30 µM, respectively, indicating that these compounds were
about 200 and 400 times more selective towards 3CLpro than thrombin
(Table 2b).


The compounds showed a higher affinity for the cysteine protease
calpain than for trypsin or thrombin. The chloromethyl ketones
(Compounds 1-3) had Ki values against calpain of 10, 9, and 8 µM,
respectively (Table 2a). Compound 4 had a Ki of 15 µM against
calpain and Compound 5 had a Ki of 20 µM. The calculated
selectivity towards 3CLpro when compared to calpain was 32, 24, 21,
38 and 39 for Compounds 1, 2, 3, 4 and 5, respectively (Table 2b).
The halomethyl ketone compounds had lower affinities towards
calpain despite the fact that the warhead is reactive towards the thiol
side chain in the active site of cysteine proteases. As is the case
for other types of warheads, the affinity of the compounds is not
dominated by the reactive warhead but by the interactions of the
entire compound with the residues in the binding cavity. This
observation indicates that these compounds can be further optimized to
improve their potency and specificity towards 3CLpro.
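
The selectivity values quoted above follow directly from the ratio in eqn 8. The sketch below recomputes them from the Ki values given in the text and in Table 5; the dictionary and function names are illustrative only.

```python
# Ki values in micromolar, as quoted in the text and Table 5.
KI_3CLPRO = {1: 0.31, 2: 0.37, 3: 0.38, 4: 0.40, 5: 0.51}
KI_CALPAIN = {1: 10.0, 2: 9.0, 3: 8.0, 4: 15.0, 5: 20.0}
KI_THROMBIN = {2: 72.0, 3: 150.0}  # only Compounds 2 and 3 were measurable


def selectivity(ki_unwanted, ki_intended):
    """Eqn 8: Ki towards unwanted target divided by Ki towards intended target."""
    return ki_unwanted / ki_intended


calpain_selectivity = {
    cpd: selectivity(KI_CALPAIN[cpd], KI_3CLPRO[cpd]) for cpd in KI_3CLPRO
}
thrombin_selectivity = {
    cpd: selectivity(KI_THROMBIN[cpd], KI_3CLPRO[cpd]) for cpd in KI_THROMBIN
}

for cpd, value in calpain_selectivity.items():
    print(f"Compound {cpd}: {value:.1f}-fold vs calpain")
for cpd, value in thrombin_selectivity.items():
    print(f"Compound {cpd}: {value:.0f}-fold vs thrombin")
```

The ratios reproduce the reported calpain selectivities to within rounding, and the thrombin ratios come out slightly below the rounded figures of 200 and 400.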


Compound 4 binds to SARS 3CLpro through a
thioether attachment

The structure of the wild-type SARS 3CLpro bound to Compound 4
was determined using X-ray crystallography. Crystallization
conditions were similar to those used before for the wild-type protease
(18), although crystals took longer to form in the presence of the
ligand than for the free protease (three months versus one week).
The structure was determined by molecular replacement using
SARS 3CLpro (PDB ID 2BX4) (28) as a search model. Data collection
and refinement statistics are summarized in Table 3. Crystals belong
to the space group P2₁2₁2 and contained one monomer of SARS 3CLpro
per asymmetric unit. The final model was refined to a resolution of
2.7 Å with an R-value of 25.9% (Rfree = 36.2%). As observed in other
structures of SARS 3CLpro, the protein is organized in three domains,
with domains I and II (residues 3-184) structured into a
chymotrypsin-like double β-barrel which forms the active site cavity
of the protein. The catalytic domains are linked by a long loop
(residues 185-200) to domain III (residues 201-306) of the protein,
which is composed predominantly of α-helices. There are no significant
differences between the liganded and unliganded structures of the
protease (PDB ID 2BX4) (28), as evidenced by their superposition,
which yields an RMSD of 0.405 Å over 277 Cα atoms.
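
As a rough plausibility check on the statement that the crystals contain one monomer per asymmetric unit, the Matthews coefficient can be estimated from the cell dimensions in Table 3. This is a sketch under stated assumptions: the monomer mass of about 33.8 kDa is not given in this section and is assumed here, the space group is taken to have four asymmetric units per cell, and the solvent fraction uses the common approximation 1 - 1.23/Vm.

```python
# Unit cell edges in angstroms (Table 3). The cell is orthorhombic,
# so the volume is simply a * b * c.
A_EDGE, B_EDGE, C_EDGE = 106.66, 45.16, 53.96

Z_ASYM_UNITS = 4       # asymmetric units per cell for P2(1)2(1)2
MONOMER_DA = 33_800.0  # assumed mass of one 3CLpro monomer, in Da

cell_volume = A_EDGE * B_EDGE * C_EDGE  # cubic angstroms

# Matthews coefficient: crystal volume per dalton of protein.
matthews_vm = cell_volume / (Z_ASYM_UNITS * MONOMER_DA)

# Approximate solvent fraction from the Matthews coefficient.
solvent_fraction = 1.0 - 1.23 / matthews_vm

print(f"V = {cell_volume:.0f} A^3, Vm = {matthews_vm:.2f} A^3/Da, "
      f"solvent ~ {solvent_fraction:.0%}")
```

With one monomer per asymmetric unit the estimate lands near the lower end of the range usually seen for protein crystals, whereas two monomers would halve Vm to a physically implausible value.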


The inhibitor binds within the substrate-binding cleft formed by the 
chymotrypsin-like fold of the enzyme (Figure 4A). Electron density 
for only a portion of the compound was observed in the final struc- 
ture with the thiol of Cys 145 covalently bound to the carbon adja- 
cent to the warhead carbonyl that was originally bonded to the 
bromine. Density for the portion of the compound corresponding to 
the Ro (or P1) moiety of the ligand was missing. Figures 5A and 5B 
show the chemical structure of Compound 4 and the electron
density in the 2Fo-Fc map. The absence of electron density for these
atoms is consistent with mass spectrometry experiments, which
indicate that the compound begins to degrade after 6 h in solution,
reflecting a mass consistent with a loss of the tert-butyl group in 
this position (data not shown). The remainder of the Ra group may 
not be visible in the structure because of high flexibility in that 
region of the compound. 


The key residues involved in the protein-ligand interaction are
shown in Figure 4B. Compound 4 forms a 1.7 Å thioether attachment
between the carbon that was originally bonded to bromine
and the Sγ of Cys145. In addition, the backbone amide of Gly
Table 5: Structure and activity of the compounds in the training
set used to generate the pharmacophore

[Structure drawings not reproduced]

Compound    Ki (µM)
1           0.31 ± 0.01
2           0.37 ± 0.01
3           0.38 ± 0.03
4           0.40 ± 0.07
5           0.51 ± 0.02
6           25.7 ± 4.6
7           64 ± 15
8           105 ± 16
9           116 ± 13
10          128 ± 44
11          134.5 ± 32
12          286 ± 62
13          366 ± 51
14          389 ± 97
15          844 ± 120
16          >1000
17          >1000
18          >1000
19          >1000
20          >1000
21          >1000
22          >1000

Compounds with Ki < 1000 µM were assigned as active and compounds with Ki > 1000 µM as inactive.


143 is positioned to donate a 2.7 Å hydrogen bond to the oxygen
atom of the warhead ketone. The Cbz group of the ligand
fits well into the S2 subsite, as shown in Figure 4C, making
non-bonding contacts with the side chains of His 41, Met 49, Met
165, and the main chain Cα atom of Arg 188. The observation
that the ligand interacts covalently with the enzyme is consistent
with time-dependent inhibition experiments, which indicate a
two-step mode of inhibition for this compound: formation
of a reversible complex followed by rearrangement to an
irreversible complex after periods exceeding 6 h (Table 1).
Together, these observations suggest that the initial binding
event is likely followed by the nucleophilic attack of the Sγ of
Cys 145 on the carbonyl carbon of the warhead, resulting in the
formation of a reversible tetrahedral complex, which is
followed by the slow, irreversible rearrangement to the
thioether. The bromine is not observed in the structure because
it is the leaving group during the rearrangement.
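
The two-step behaviour described above is commonly written as the textbook scheme for time-dependent inactivation. The expressions below are a generic sketch of that scheme using the symbols Ki and k3 from this paper; they ignore substrate competition and are not claimed to be the exact form of the equations used for the kinetic fits.

```latex
% Reversible binding followed by slow, irreversible alkylation
E + I \;\overset{K_i}{\rightleftharpoons}\; E\cdot I \;\xrightarrow{\;k_3\;}\; E\text{-}I

% Observed pseudo-first-order inactivation rate at inhibitor concentration [I]
k_{\mathrm{obs}} = \frac{k_3\,[I]}{K_i + [I]}

% Limiting behaviour at saturating inhibitor
[I] \gg K_i \;\Rightarrow\; k_{\mathrm{obs}} \to k_3
```

When k3 is very small, as for Compound 4, the first step equilibrates long before any appreciable alkylation occurs, so the compound appears reversible on the time scale of a kinetic assay.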


Structure-activity relationships 
As an optimization tool, a pharmacophore model was generated
based on the activity data of the training database containing the
halomethyl ketone compounds. The pharmacophore model defines
the orientation of essential features that make up the scaffold of
the library of lead compounds. The viability of each model during
the optimization process was determined by assessing its ability to
discriminate between true and false positives. In order to generate
an accurate model, maximum structural diversity from the
compounds in the training set was incorporated by setting a threshold
value of 1000 µM. The pharmacophore model was generated based
on this activity data combined with the structural information of the
compounds (Table 5). The final, refined pharmacophore model, as
shown in Figure 6A, contained the following features: a non-planar
hydrophobic feature with a sphere radius of 1.1 Å (red), a planar
donor feature with a sphere radius of 1.2 Å (purple), a planar
hydrophobic or a non-planar hydrophobic feature with a sphere
radius of 1.2 Å (blue), a planar acceptor or planar donor feature




Figure 6: The 3D-pharmacophore query that was used for virtual
screening. (Panel A) The features of the pharmacophore are shown as
spheres in dot configuration along with the external shell, which is
shown in grey dot configuration. The aplanar hydrophobic feature is
rendered in red, planar hydrophobic feature in green, planar donor
feature in purple, planar or aplanar hydrophobic feature in blue and
planar donor and acceptor in yellow. (Panel B) The 3D-pharmacophore
query aligned with the selected conformation of Compound 4 is shown.
The conformation of the selected molecule was dictated by the
orientation of the spheres representing each pharmacophore feature.
The warhead feature is specified by the aplanar hydrophobic feature
(red), P1 residue by the planar donor (purple) and a planar or
aplanar hydrophobic feature (blue), compound backbone by a planar
donor or acceptor (yellow), and the terminal group is defined by a
planar hydrophobic feature (green).
with a sphere radius of 1.3 Å (yellow), and a planar hydrophobic
feature with a sphere radius of 1.5 Å (green). This model selected
only the active molecules in the training database, providing an
Aobserved value of 1.5. For the selected threshold, the theoretical
effectiveness of this model, calculated as the ratio of Aobserved
to Aideal (eqn 4), was 100%. The applicability of this model can
be better visualized by examining a molecule selected on the basis
of the structural constraints imposed by the model (Figure 6B). A
non-planar hydrophobic feature defined the region corresponding to
the warhead of the molecule. The structural features defining the
P1 subsite were a planar hydrophobic or a non-planar hydrophobic
feature. This result agrees with the kinetic data discussed earlier
and indicates that the compounds with the highest affinity to 3CLpro
have either a planar phenyl moiety or a flexible aliphatic chain
(Table 1).
Library screening for model validation

The pharmacophore model defined above was validated by using
a conformational database of approximately 1 000 000 compounds
provided by MOE. The pharmacophore search of the test
set resulted in the selection of 40 molecules as hits. The
hypothesis for this binary model was that the selected molecules
would inhibit 3CLpro with a Ki of 1000 µM or lower, and that the
unselected molecules would have a Ki greater than the threshold.
Based on this hypothesis, Aideal was calculated to be
26315.8 using eqn 3. A common feature of the compounds that
were hits was the presence of a halomethyl ketone group with
an adjacent aromatic moiety. When the ability of the 40
selected compounds to inhibit 3CLpro was measured, 38 out of
40 had Ki values below 1000 µM and 2 were false positives with Ki
values greater than the threshold, resulting in an Aobserved of 25 000.
Based on these results, an observed effectiveness of the model
of 95% was calculated. The data for the best compounds from
this validation step are shown in Table 4. These results provide
evidence for the accuracy of the pharmacophore model. The
compound with the highest affinity had a Ki of 4.5 ± 1 µM, which is
encouraging given that the compound was chosen
from a database not specialized for halomethyl ketones, using a
pharmacophore that had a broad threshold. This initial model
performed well on both the training set and a randomly chosen test
set database, establishing itself as an attractive scaffold that
can be further optimized. Some unexpected warheads with
moderate affinities, such as Product 1 and Product 2 in Table 4, were
also recovered. In the first optimization cycle, the activity data
from this validation step was used to further refine the
pharmacophore by reducing the threshold limit and consequently the
sensitivity of the model. Eventually, a highly specialized
pharmacophore will be developed that selects for high-affinity
compounds against 3CLpro with the features defined in the model.
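
The reported Aideal, Aobserved and effectiveness values can be reproduced with a simple enrichment-style ratio. Equations 3 and 4 are defined outside this section, so the formula below is an assumption chosen because it matches the reported numbers: the hit rate among selected compounds divided by the fraction of actives in the whole database. It also assumes that the 38 confirmed inhibitors are the only actives in the screened database.

```python
def enrichment(true_pos, n_selected, n_active, n_total):
    """Hit rate among selected compounds divided by the base rate of actives.

    Assumed form of the A statistic; eqn 3 and eqn 4 are not shown here.
    """
    return (true_pos / n_selected) / (n_active / n_total)


# Screening set: 1 000 000 compounds, 40 hits, 38 confirmed actives.
N_TOTAL, N_HITS, N_TRUE = 1_000_000, 40, 38
a_observed = enrichment(N_TRUE, N_HITS, N_TRUE, N_TOTAL)
a_ideal = enrichment(N_TRUE, N_TRUE, N_TRUE, N_TOTAL)  # a perfect model
effectiveness = a_observed / a_ideal

# Training set: 22 compounds, 15 with Ki below 1000 uM, all 15 selected.
a_training = enrichment(15, 15, 15, 22)

print(f"A_ideal = {a_ideal:.1f}, A_observed = {a_observed:.0f}, "
      f"effectiveness = {effectiveness:.0%}")
print(f"Training-set A = {a_training:.2f}")
```

Under these assumptions the ratio gives the quoted values for the screening set, and the training-set value rounds to the 1.5 reported for the refined model.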


Conclusions 


The identification of inhibitors targeted towards the highly
conserved main protease 3CLpro (Mpro) of coronaviruses is an
important step towards the development of new classes of antivirals.
In this paper, we have shown that halomethyl ketones can be
potent and selective inhibitors of the SARS protease 3CLpro.
While inhibitors like Compound 4 end up forming a covalent
thioether complex, they do so very slowly (over 6 h),
allowing the binding reaction to be controlled by reversible
thermodynamic interactions. Compound 4 binds with favourable
enthalpy and entropy changes. Since Compound 4 has a
molecular weight of only 400.26 Da, it has the potential for much
improved potency and selectivity.